vast amount
RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content
Large Language Models (LLMs) are trained on vast amounts of data, most of which is automatically scraped from the internet. This data includes encyclopedic documents that harbor a vast amount of general knowledge (e.g., Wikipedia) but also potentially overlap with benchmark datasets used for evaluating LLMs. Consequently, evaluating models on test splits that might have leaked into the training set is prone to misleading conclusions. To foster sound evaluation of language models, we introduce a new test dataset named RepLiQA, suited for question-answering and topic retrieval tasks. RepLiQA is a collection of five splits of test sets, four of which have not been released to the internet or exposed to LLM APIs prior to this publication. Each sample in RepLiQA comprises (1) a reference document crafted by a human annotator and depicting an imaginary scenario (e.g., a news article) absent from the internet; (2) a question about the document's topic; (3) a ground-truth answer derived directly from the information in the document; and (4) the paragraph extracted from the reference document containing the answer. As such, accurate answers can only be generated if a model can find relevant content within the provided document. We run a large-scale benchmark comprising several state-of-the-art LLMs to uncover differences in performance across models of various types and sizes in a context-conditional language modeling setting.
Four reasons to be optimistic about AI's energy usage
"Dollars are being invested, GPUs are being burned, water is being evaporated--it's just absolutely the wrong direction," says Ali Farhadi, CEO of the Seattle-based nonprofit Allen Institute for AI. But sift through the talk of rocketing costs--and climate impact--and you'll find reasons to be hopeful. There are innovations underway that could improve the efficiency of the software behind AI models, the computer chips those models run on, and the data centers where those chips hum around the clock. Here's what you need to know about how energy use, and therefore carbon emissions, could be cut across all three of those domains, plus an added argument for cautious optimism: There are reasons to believe that the underlying business realities will ultimately bend toward more energy-efficient AI. The most obvious place to start is with the models themselves--the way they're created and the way they're run.
Could AI robots replace human astronauts in space?
Technology can play a part in complementing human space travel by freeing up astronauts from certain tasks to allow them to focus on more important research. "[AI could be used to] automate tedious tasks," explains Dr Kiri Wagstaff, a computer and planetary scientist in the US who previously worked at Nasa's Jet Propulsion Laboratory in California. "On the surface of a planet, humans get tired and lose focus, but machines won't." The challenge is that vast amounts of power are needed to operate systems like large language models (LLMs), which can understand and generate human language by processing vast amounts of text data. "We are not at the point of being able to run an LLM on a Mars rover," says Dr Wagstaff.
Google goes NUCLEAR: Tech giant will use nuclear reactors to generate the vast amounts of energy needed to power its AI data centres
With its Gemini chatbot and Pixel AI phone software, it's fair to say Google has an obsessive focus on artificial intelligence. But all that advanced computational power requires millions of computers, known as 'servers', housed inside data centres across the world that operate 24/7. Now, in an attempt to cater to its vast AI needs, Google is going nuclear. The tech giant has signed a deal with California-based nuclear firm Kairos Power to build new nuclear reactors to supply its US data centres with energy. Although the location of these reactors is yet to be revealed, Google said the first will be operational in 2030, with more to follow by 2035.
Apple Intelligence is coming. Here's what it means for your iPhone
Artificial intelligence (AI) is coming to your iPhone soon and, according to Apple, it's going to transform the way you use your device. Launching under the brand name "Apple Intelligence", the iPhone maker's AI tools include a turbocharged version of its voice assistant, Siri, backed by a partnership with ChatGPT owner OpenAI. The technology is already available on smartphones including Google's latest Pixel and Samsung's Galaxy range. Yet the vast amounts of data needed by AI are leading to concerns about data privacy.
DNA computer can play chess and solve sudoku puzzles
A computer made from DNA that can solve basic chess and sudoku puzzles could one day, if scaled up, save vast amounts of energy over traditional computers when it comes to tasks like training artificial intelligence models. DNA devices have a number of potential advantages, such as being able to safely store vast amounts of information, in microscopically tiny volumes, for millennia.
The Guardian blocks ChatGPT owner OpenAI from trawling its content
The Guardian has blocked OpenAI from using its content to power artificial intelligence products such as ChatGPT. Concerns that OpenAI is using unlicensed content to create its AI tools have led to writers bringing lawsuits against the company and creative industries calling for safeguards to protect their intellectual property. The Guardian has confirmed that it has prevented OpenAI from deploying software that harvests its content. Generative AI technology, the term for products that generate convincing text, image and audio from simple human prompts, has dazzled the public since a breakthrough version of its ChatGPT chatbot launched last year. However, fears have arisen about the potential mass-production of disinformation and the way in which such tools are built.
'It's not like science fiction any more': Nasa aiming to make spaceships talk
Now Nasa engineers say they are developing their own ChatGPT-style interface that could ultimately allow astronauts to talk to their spacecraft and mission controllers to converse with artificial intelligence-powered robots exploring distant planets and moons. An early incarnation of the AI is slated to be deployed on Lunar Gateway, a planned extraterrestrial space station that is part of the Artemis programme, according to the engineer developing the technology. "The idea is to get to a point where we have conversational interactions with space vehicles and they [are] also talking back to us on alerts, interesting findings they see in the solar system and beyond," Dr Larissa Suzuki, a visiting researcher at Nasa, said. Speaking at a meeting on next-generation space communication at the Institute of Electrical and Electronics Engineers (IEEE) in London on Tuesday, Suzuki outlined an interplanetary communications network with inbuilt AI to detect, and possibly fix, glitches and inefficiencies as they occur. "It then alerts mission operators that there is a likelihood that package transmissions from space vehicle X will be lost or will fail delivery," she said.
Which Jobs Will GPT-4 Replace First?
Artificial Intelligence is once again ready to swipe jobs from under our noses! This time, a Twitter user asked our very own ChatGPT to list out which jobs GPT-4 would replace. Oh boy, things are heating up now that GPT-4 is out and about! With its impressive capabilities, it's no wonder that people are feeling more and more insecure about their jobs. But hey, at least ChatGPT will continue to improve and become even more of a linguistic powerhouse -- whether that's good news for us or not is up for debate!